Monte Carlo Approach to the Evaluation of Nanoparticles Size Distribution from the Analysis of UV-Vis-NIR Spectra

How nice would it be to obtain the size distribution of a nanoparticle dispersion fast and without electron microscope measurements? UV-Vis-NIR spectrophotometry offers a very rapid solution; however, the spectra interpretation can be very challenging and needs to take into account the size distribution of the nanoparticles and agglomeration. This work suggests a Monte Carlo method for rapid fitting UV-Vis-NIR spectra using one or two size distributions starting from a dataset of precomputed spectra based on Mie theory. The proposed algorithm is tested on copper nanoparticles produced with Pulsed Laser Ablation in Liquid and on gold nanoparticles from the literature. The fitted distribution results are comparable with Transmission Electron Microscope results and, in some cases, reflect the presence of agglomeration.


Introduction
Nowadays, nanoparticles (NPs) are used in a wide range of applications and devices thanks to the combination of classical and quantum effects in optical, conductive, magnetic properties, and so forth.In particular, copper and gold NPs are widely used in plasmonic devices due to their optical properties in the visible light range.The UV-Vis-NIR spectra of Cu and Au NPs present a typical peak, called plasmonic peak, in a specific wavelength range, depending on the material, that can be used for various devices and applications [1][2][3][4][5][6][7][8][9][10].
The plasmonic peak position and shape strictly depend on the NPs size and shape [11,12], so knowing the particle size distribution is very important.Transmission Electron Microscopy (TEM) analysis gives detailed information about the particle size distribution but does not give appreciable information on the aggregation status, and both the sample preparation and analysis require a long time [13].Also, sample preparation can alter it, especially for copper and other non-precious metals that can oxidize.Scanning Electron Microscopy (SEM) analysis is faster, and the sample preparation is easier, but a size resolution under ∼10 nm is commonly hard to obtain.UV-Vis-NIR spectrophotometry can perform rapid measurements of particle suspension without altering the sample.In particular, the extinction cross section (given by absorption plus scattering) is related to the particles' shape and size through Mie scattering theory [11,12,14].A fitting routine that uses Mie scattering is not easy to implement, and for the few works present in the scientific literature, everyone considers a constant refractive index for the solvent.An efficient fitting routine is proposed in Refs.[13,15] for monodisperse spheroidal gold NPs using the Mie-Gans model.Other approaches of fitting or simulation are proposed in the scientific literature mainly for gold monodispersed NPs or spheroids, like Machine Learning [16], Least Squares Approximation combined with matrix formalism [17], Discrete Dipole Approximation [18] and Effective Medium theories for NP clusters [19].
The aim of this work is to develop a versatile computational method to easily extract metal NPs size distribution from an optical spectrum using a fast Monte Carlo technique (in terms of the number of iterations) that can work both with monodisperse and polydisperse nanoparticles in a wide range of wavelengths (250-1100 nm).In particular, a dataset will be previously computed to fit the experimental data using some physical considerations as an alternative to a mathematical or Machine Learning approach.Firstly, the code is first used on experimental spectra of some copper NPs dispersions produced with Pulsed Laser Ablation in Liquid (PLAL) and then on the gold literature data.

Experimental Section
Copper NPs were synthesized using an Nd:YAG ns-pulsed laser (Quanta-Ray PRO-Series Nd:YAG with wavelength λ = 1064 nm, pulse length = 12 ns, mean power = 5 W, and repetition rate = 10 Hz) with the methodology described in Ref. [20].A lens (focal length of 10 cm) focused the laser beam on a copper target at the bottom of a Teflon vessel, filled with 8 mL of liquid (acetone, methanol, and ethanol).The ablated mass was measured with a Sartorius M5 microbalance (sensitivity 0.01 mg) by weighting the target before and after the ablation, resulting, respectively, in 0.07 mg, 0.13 mg, and 0.70 mg for acetone, methanol, and ethanol with an accuracy of 0.02 mg.
The obtained NPs colloidal solutions were sonicated for 15 min and optically analyzed with a PerkinElmer LAMBDA 1050+ UV-Vis-NIR Spectrophotometer, measuring the absorbance from 200 nm to 1100 nm.A baseline correction was performed using the measured absorbance of the relative solvent for each solution.

Computational Section
Computational analysis is performed using two codes developed on Wolfram Mathematica 13 software [21].The overall process is schematically represented in Figure 1.The first code (Dataset_creation.nb,described in Appendix C), uses the results of the Mie scattering theory to create a dataset containing the spectrum of spherical particles having a different size.The starting point for these simulations is the material's and medium's refractive index, which can be easily found in an online database [22] or in the Palik Handbook [23].Unfortunately, more than ten refractive indexes for copper are available in the literature, in the visible range, with slight differences among them.So the copper refractive index was experimentally evaluated (see Appendix A).This code should work also in the UV region, where the refractive index of the solvents cannot be considered a constant, so their formulas are taken from the online database [22] and reported in Appendix B. A database is computed for each solvent for various radii and wavelength ranges.
The second code comes in two versions: one is for monosdispersed NPs colloidal solution (Mono_Fitting.nb) and the other is for polydispersed ones (Poly_Fitting.nbalgorithm presented in Figure 2).Firstly, both the experimental data and cross section from the dataset are acquired.The experimental data are usually reported in arbitrary units, and the computed cross sections present values of the order of magnitude less than 10 −13 m 2 .Working with these values is computationally inconvenient; for this reason, they will be rescaled.The total absorption spectrum of a given particle distribution f (r) is obtained by integrating over the full particle radius in the range, but when computed, the integral must be discretized: The denominator ∑ r i f (r i ) does not depend on the wavelength and acts like a scale parameter, so it is ignored to lighten the computational burden.The colloidal solutions usually are characterized by monodisperse or polydisperse nanoparticles [24].In the most simple case (monodisperse NPs), the size distribution follows a lognormal distribution: with three parameters θ = {a 1 , µ 1 , w 1 }.Polydisperse solutions present at least two separate size distributions.In this case, a lognormal size distribution for smaller particles is used, while for bigger particles or aggregates, a Gaussian size distribution is chosen.This choice is due to the random particle aggregation process [13].Therefore, the assumption is that the global distribution is: with six parameters θ = {a 1 , a 2 , µ 1 , µ 2 , w 1 , w 2 }.A fitting routine for f (r i , θ) is manually implemented to find the optimal parameter set θ that minimizes the mean squared error (MSE) between the simulation X i and the experimental data Y i .The MSE is defined as: where σ tot contains a sum over all the radii (Equation ( 1)) and the multiplication factor 1/n is ignored.This new quantity ERROR is proportional to MSE and will be minimized.The fitting routine is divided into three steps: 1.
Files reading: Experimental data are acquired, sorted, and normalized.The dataset is acquired at the same wavelengths as the experimental points, and it is also rescaled.This automatically leads to the use of the wavelength range in which both the experimental data and the computed dataset are defined.

2.
Assign starting point parameters: choosing the starting point parameters for a function of three or six parameters is crucial.Starting with some random parameters can lead the gradient to descend toward a local minimum without specific physical significance.It is known that "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk-E.Fermi" [25].To pursue this aim, two strategies are followed: • Monodisperse NPs: The ERROR is evaluated between the experimental data and every spectrum in the dataset.The spectrum that produces the minimum ERROR gives the starting point for the distribution centroid µ 1 and the scale parameter a 1 .This evaluation is performed in a small range (a convenient one can be 400 nm ≤ λ ≤ 700 nm because gold and copper have their plasmonic peak within this range).
• Polydisperse NPs: The ERROR is evaluated between the experimental data and every spectrum in the dataset in two different ranges.Small particles strongly contribute in the UV, so a 1 and µ 1 (lognormal distribution) are assigned by finding the minimum ERROR among the computed spectra for λ ≤ 350 nm.Bigger particles and aggregates strongly contribute in the IR, so a 2 and µ 2 (Gaussian distribution) are assigned by finding the minimum ERROR among the computed spectra for λ ≥ 700 nm.These edge values for λ are purely indicative and can easily be changed in the code to find the optimal starting point for each sample.The initial values of w 1 = 0.5 and w 2 = 3 are assigned arbitrarily.

3.
Monte Carlo step: A cycle where a new set of parameters θ is randomly generated each time within a range of the initial parameter.Whenever the ERROR obtained with the new set of parameters is lower than the initial ERROR, the parameters are updated, and the process is repeated for a fixed number of iterations, but new parameters can now vary in a smaller range than the previous one: where Random[−1, 1] indicates a random number generated between −1 and 1, and Range = 1/(2 + Count) is the range in which the new parameter is generated, with Count = 0 that increases at every successful parameter update (the symbol := is used to indicate a variable update).A visual representation of this process is given in Figure 3.The relative error associated with each parameter is given by 1/ √ m, where m is the number of iterations.At the end of the cycle, a plot and a text file are exported.This approach of separating the theoretical computation (first code) and the curve fitting (second code) represents an alternative to the methods proposed in the literature [13,15], where the cross section is computed time by time.Computing the dataset separately from the fitting allows to use the dataset multiple times without recomputing it.The dataset computing time and the use of computational resources depend on the wavelength range and the radius range.In this work, the Mie scattering theory is used to compute the dataset instead of the Gans theory used in the other work because this code is supposed to work with particles of a radius up to 250 nm (see Ref. [11] Section 9.1.2and Ref. [12] Section 2.1.4.a).The calculations regarding Mie scattering are more complex concerning the ones of Gans scattering and so require more time and CPU, but allow to work without NPs size limitation, and the calculations are computed only once, lightening the computational burden in the long term.As compared to Machine Learning [16], this Monte Carlo approach allows to extract a size distribution and also results in being faster than a classical gradient descent calculation both in terms of time and computational burden.Using gradient descent, every parameter update will contain a sum over all the radii and all the wavelengths will be iterated m times as follows: where α is the learning rate (using the Machine Learning formalism) and this will require at least number_dataset_elements × number_o f _wavelengths × m × j operations.The specific operation is described by Equation ( 6) that contains the derivative of the distribution.This can be evaluated numerically in every cycle or pre-computed, increasing the computational burden.The computational burden of the Monte Carlo fitting depends instead only on some simple mathematical operation (exponentials in distribution evaluation in Equation ( 2) or (3) and squaring in Equation ( 4)) and two summations (over the radii in Equation ( 1) and over the wavelength in Equation ( 4)) iterated m times.The parameter update of Equation ( 5) recalls only a random number generation, so the computational burden depends linearly on the radius range chosen in the dataset, on the wavelength range of the experimental data, and on the chosen number of iterations.Both methods were tested on the same machine (Intel Core i7 11th Gen, 16 GB RAM, data file 351 elements, radius dataset 500 elements): the Monte Carlo method with 900 iterations was executed in only 45 seconds, while the gradient descent required at least 40 seconds each iteration.

Results
The experimental extinction spectra of the copper solutions (dotted lines in Figure 5) present the typical copper plasmonic peak at ∼600 nm [4,11] and a strong absorbance in the UV region that goes to zero in the IR.The spectra of copper NPs produced in methanol and ethanol present also a hint of a large shoulder between 600 and 800 nm and a plateau between 400 and 500 nm, suggesting the presence of agglomerates or particles in the order of 100 nm radius.Bimodal distribution is reported to be intrinsic on the PLAL technique [24] and also a previous study on copper NPs [20] confirms this trend, so with these samples, the bimodal distribution was used.Also, the mono distribution fitting was tested, leading, while the mono distribution fitting lead no appreciable results.Copper NPs produced in acetone were fitted with a mono distribution.
The Monte Carlo algorithm was iterated for 400 or 900 cycles.Figure 4 shows how the error rapidly decreases in the first ∼200 steps and then remains stable for each colloidal solution, except for some fine adjustments, proving that the algorithm is fast and converges and justifying the choice to tighten the range at each iteration.[13]).The points indicate the iteration in which there was a parameter update.The label in the legend indicates both if it refers to a monodispersion or a polydispersion and the used refractive index: "This Work"-Appendix A; "Hagemann"-Refs.[22,26], "Palik"-Ref.[23]; "JohnosnChristy"-Refs.[22,27]; "Olmon"-Refs.[22,28]; "BabarWeaver"-Refs.[22,29].The fitting routine is iterated 400 times on gold NPs and 900 times on copper NPs.All the fitting results are presented in Appendix D.
The wavelength range in which the fit is performed and the initial parameter choice are crucial.Fitting tests were conducted down to 200 nm in wavelength, resulting in values µ 1 around ∼0.1 nm.This value has no physical meaning because it corresponds to the copper atomic radius [30] and may come from some instrument artifacts in the UV region.For this reason, the fitting range was restricted to 250-1000 nm.
Each graph of Figure 4 represents the MSE behavior of four refractive indexes through the algorithm iterations.The fitting of copper NPs produced in methanol and acetone using the copper refractive index evaluated in this work apparently are the worst, even if the obtained refractive index is very close to the literature ones.These discrepancies in terms of final MSE are very low (0.002 for methanol and 0.01 for acetone) and come from the intrinsic differences of the refractive indexes available in the literature.
Figure 5 and the tables in Appendix D report the obtained best-fitting parameters.The fit curves strongly adapt to the data in the Vis-IR region (λ > 600 nm), while the fit is not satisfying in the Vis-UV region.Despite these discrepancies, the algorithm applied on copper NPs produced in methanol converges on µ 1 ∼ 2.5 nm, w 1 ∼ 0.4, µ 2 ∼ 85 nm, w 2 ∼ 5 nm with all the used refractive indexes.Copper NPs produced in ethanol results in µ 1 ∼ 2 nm, w 1 ∼ 0.3, µ 2 ∼ 108 nm, w 2 ∼ 2 nm, and copper NPs produced in acetone results in µ 1 ∼ 2.1 nm, w 1 ∼ 0.2.
In Refs.[20,31], the same NPs are produced with slightly the same process parameters and analyzed with TEM, resulting in µ 1 = 2.1 nm, w 1 = 0.62 for Cu NPs in methanol, µ 1 = 3.3 nm, w 1 = 0.52 for Cu NPs in ethanol and µ 1 = 2.6 nm, w 1 = 0.14 for Cu NPs in acetone.For NPs dispersion produced with PLAL, the mean radius value is strongly dependent on the laser fluence, and some casual fluctuations may occur; in fact, the algorithm finds very similar values.The distribution width (w 1 ) instead maintains the same trend w Methanol > w Ethanol > w Acetone in both the fitted and measured distributions.For the polydispersed fitted distributions, the ratio a 1 /a 2 order of magnitude is 10 3 (methanol) and 10 5 (ethanol), indicating that the number of small particles is greater than that of the bigger ones.The latter majorly contributes to the total cross section, especially in the IR region.In each graph, the blue dots represent the experimental cross section, and the yellow line represents the best fit.In (a-c), the experimental data are obtained using the methodology described in Section 2.1.In (d), the data are adapted from Ref. [13].The inset reports the fitting MSE and distribution parameters.
Lastly, the algorithm is tested on the spectra of gold NPs produced in water with PLAL adapted from Ref. [13], obtaining µ = 3.4 nm, w = 0.2 compared to µ = 3.5 nm, w = 0.05 obtained in their work using the same gold refractive index.Using some other works in the literature and measured refractive indexes, the algorithm converges to the values of µ = 3 nm, w ∼ 0.3.
As the values obtained are comparable, the competitiveness of the developed algorithm, for copper NPs or other metals, is validated.However, the percentage error associated with the parameters (less than 5% for more than 400 iterations) is an underestimation.The main source of error is not the precision of the algorithm but the refractive index itself.The final values of size and distribution spread (µ and w in Tables of Appendix D) .can be comparable to each other independent of the used refractive index if an error of 10% is considered.This value has no catastrophic consequence for the possible applications (for example, an uncertainty of 0.2 nm for the value of 2 nm of the distribution peak) and proves the algorithm's robustness against the biggest source of error.bigger particles (radius up to 250 nm), at least L max = 10 is necessary to converge, especially in the UV as reported in Figure A2.In this work, L max = (2(Integer_part[R/50] + 1) + 1) is used, so the multipole order starts from L max = 3 for smaller particles and increases by 2 every 50 nm radius.This approach is more precise but requires a lot of computational time, especially with bigger particles.At the end of the cycle, the dataset file is exported.

Figure 1 .
Figure 1.Schematic description of the software developed.

Figure 2 .
Figure 2. Schematic description of the fitting algorithm with example graphics of polydisperse copper NPs produced in ethanol with PLAL.

Figure 3 .
Figure 3. Example of the same Monte Carlo gradient descent implemented on a two-variable function for a schematic visualization.On the left, the numbered boxes indicate the parameter values and the range in which random number generators work.On the right, the same numbered points descend toward the minimum of an example function.

Figure 5 .
Figure 5. Best-fitting extinction of copper NPs produced in (a) methanol, (b) ethanol, (c) acetone, and (d) gold NPs produced in water.In each graph, the blue dots represent the experimental cross section, and the yellow line represents the best fit.In (a-c), the experimental data are obtained using the methodology described in Section 2.1.In (d), the data are adapted from Ref.[13].The inset reports the fitting MSE and distribution parameters.

Figure A2 .
Figure A2.Computed extinction cross section for particle of 150 nm and 200 nm radius truncated at different multipole order (L max ).

Table A1 .
Best-fitting parameters and MSE copper NPs produced in methanol, dataset computed with various refractive indexes and various radius ranges.

Table A2 .
Best-fitting parameters and MSE copper NPs produced in ethanol, dataset computed with various refractive indexes and various radius ranges.

Table A3 .
Best-fitting parameters and MSE copper NPs produced in acetone, dataset computed with various refractive indexes and various radius ranges.

Table A4 .
[13]-fitting parameters and MSE gold NPs produced in Water (data adapted from Ref.[13]), dataset computed with various refractive indexes and various radius ranges.